Skip to content

Add DoS protection and idle timeout to vMCP session manager#4047

Draft
yrobla wants to merge 1 commit intomainfrom
issue-3874
Draft

Add DoS protection and idle timeout to vMCP session manager#4047
yrobla wants to merge 1 commit intomainfrom
issue-3874

Conversation

@yrobla
Copy link
Copy Markdown
Contributor

@yrobla yrobla commented Mar 9, 2026

Summary

Implements resource-exhaustion protections for the session-scoped backend lifecycle (SessionManagementV2), resolving issue #3874.

Global session limit: new session requests (no Mcp-Session-Id header) receive HTTP 503 with a Retry-After header when the server-wide cap is reached. Default: 100 sessions.

Per-client session limit: CreateSession enforces a maximum number of concurrent sessions per auth.Identity.Subject. Anonymous clients are exempt. The counter is rolled back on all failure paths. Default: 10 sessions per identity.

Idle session timeout: a background reaper goroutine terminates sessions that have had no CallTool activity for longer than the configured threshold. The idle clock resets on every tool call and is initialised when a session is fully established. The reaper is wired into shutdownFuncs so it stops cleanly on server shutdown. Default: 5 min.

All three limits are configurable via Config fields; zero values select the defaults. The Limits struct is passed to sessionmanager.New() and all existing call sites are updated.

Fixes #3874

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Does this introduce a user-facing change?

No

Special notes for reviewers

@github-actions github-actions bot added the size/M Medium PR: 300-599 lines changed label Mar 9, 2026
@yrobla yrobla requested a review from Copilot March 9, 2026 11:55
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 88deb977d6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread pkg/vmcp/server/server.go Outdated
Comment thread pkg/vmcp/server/sessionmanager/session_manager.go Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements resource-exhaustion protections for SessionManagementV2 by adding configurable session limits and idle-session cleanup to reduce risk of backend connection/memory/FD exhaustion.

Changes:

  • Add per-client (identity subject) session limiting and idle-session tracking/reaping to sessionmanager.Manager.
  • Add server-level global session limit middleware returning HTTP 503 + Retry-After for new sessions when capped.
  • Extend/adjust unit tests to cover per-client limits and idle reaping behaviors.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File Description
pkg/vmcp/server/sessionmanager/session_manager.go Adds limits struct, per-client counters, idle tracking + reaper, and integrates cleanup into session termination/tool calls.
pkg/vmcp/server/server.go Adds config knobs/defaults, global session-limit middleware (503 + Retry-After), and wires idle reaper lifecycle into server start/shutdown.
pkg/vmcp/server/sessionmanager/session_manager_test.go Updates constructors for new Limits parameter and adds tests for per-client limit + idle reaper behavior.
Comments suppressed due to low confidence (1)

pkg/vmcp/server/server.go:607

  • The new session-limit middleware is applied before AuthMiddleware in code, but due to wrapper ordering the auth middleware becomes the outermost handler and runs first. This contradicts the comment and means over-limit requests will still pay auth cost. Apply AuthMiddleware first and then wrap with sessionLimitMiddleware (so the limit check runs before auth).
	// Apply session limit middleware when V2 session management is active.
	// Runs before auth so over-limit requests are rejected early without auth overhead.
	if s.vmcpSessionMgr != nil && s.config.MaxSessions > 0 {
		mcpHandler = s.sessionLimitMiddleware(mcpHandler)
		slog.Info("session limit middleware enabled", "max_sessions", s.config.MaxSessions)
	}

	// Apply authentication middleware if configured (runs first in chain)
	if s.config.AuthMiddleware != nil {
		mcpHandler = s.config.AuthMiddleware(mcpHandler)
		slog.Info("authentication middleware enabled for MCP endpoints")

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/vmcp/server/sessionmanager/session_manager.go
Comment thread pkg/vmcp/server/sessionmanager/session_manager.go
Comment thread pkg/vmcp/server/sessionmanager/session_manager.go
Comment thread pkg/vmcp/server/sessionmanager/session_manager_test.go Outdated
Comment thread pkg/vmcp/server/server.go
Comment thread pkg/vmcp/server/server.go
Comment thread pkg/vmcp/server/server.go Outdated
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 9, 2026

Codecov Report

❌ Patch coverage is 68.70748% with 46 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.85%. Comparing base (0de510d) to head (264f539).
⚠️ Report is 135 commits behind head on main.

Files with missing lines Patch % Lines
pkg/vmcp/server/sessionmanager/session_manager.go 76.69% 16 Missing and 8 partials ⚠️
pkg/vmcp/server/server.go 50.00% 13 Missing and 9 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4047      +/-   ##
==========================================
- Coverage   69.13%   68.85%   -0.28%     
==========================================
  Files         462      462              
  Lines       46554    46720     +166     
==========================================
- Hits        32185    32171      -14     
- Misses      11897    11916      +19     
- Partials     2472     2633     +161     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 9, 2026
Copy link
Copy Markdown
Contributor

@jerm-dro jerm-dro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these changes are required for the session v2 work. Would you be okay setting this work aside for now?

I think we'll tackle a related problem as part of the upcoming scale work, where we'll need to implement an LRU for the session storage so not all sessions are held in memory.

Comment thread pkg/vmcp/server/server.go
Comment on lines +631 to +660
// sessionLimitMiddleware is a best-effort fast-fail for new session requests
// (no Mcp-Session-Id header): it returns HTTP 503 + Retry-After before the
// request reaches the SDK when the global session cap appears to be reached.
// Existing sessions (with a valid Mcp-Session-Id) are never affected.
//
// This check is intentionally optimistic (non-atomic): it avoids the overhead
// of routing and SDK processing for clearly-over-limit requests, but it does
// not guarantee strict enforcement under concurrent load. Strict enforcement
// is provided atomically by sessionmanager.Manager.Generate(), which uses an
// increment-first reservation to prevent races between concurrent initialize
// requests.
func (s *Server) sessionLimitMiddleware(next http.Handler) http.Handler {
// Resolve the concrete manager once so we can call ActiveSessionCount().
mgr, _ := s.vmcpSessionMgr.(*sessionmanager.Manager)
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Header.Get("Mcp-Session-Id") == "" && mgr != nil {
if mgr.ActiveSessionCount() >= s.config.MaxSessions {
w.Header().Set("Retry-After", strconv.Itoa(defaultRetryAfterSeconds))
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusServiceUnavailable)
_, _ = w.Write([]byte(
`{"error":{"code":-32000,"message":"Maximum concurrent sessions exceeded. ` +
`Please try again later or contact administrator."}}`,
))
return
}
}
next.ServeHTTP(w, r)
})
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker: can we remove the rate limiting behavior from this PR?

It's not really needed today, so I'd prefer to avoid adding it.

Comment on lines +115 to +128
// perClientMu guards perClientCounts and sessionSubject.
perClientMu sync.Mutex
perClientCounts map[string]int // subject → active session count
sessionSubject map[string]string // sessionID → subject (for decrement on Terminate)

// idleActivityMu guards idleActivity.
idleActivityMu sync.RWMutex
idleActivity map[string]time.Time // sessionID → last active time

// activeSessionCount tracks sessions that have been generated but not yet
// terminated, excluding terminated placeholders left for TTL cleanup.
// This gives an accurate count for global limit enforcement, unlike
// storage.Count() which includes those terminated placeholders.
activeSessionCount atomic.Int64
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker: this is adding a lot of complexity to Manager. We should refactor the Manager struct first so we don't have to mix all these concerns together in one class. That would also make unit testing much simpler. I would expect a change like this doesn't require the integration_tests at all.

@yrobla
Copy link
Copy Markdown
Contributor Author

yrobla commented Mar 10, 2026

ok setting as draft for now. I know that is not essential but i was speeding up with pending tasks while the main code was reviewed...

@yrobla yrobla marked this pull request as draft March 10, 2026 15:01
Implements resource-exhaustion protections for the session-scoped
backend lifecycle (SessionManagementV2), resolving issue #3874.

Global session limit: new session requests (no Mcp-Session-Id header)
receive HTTP 503 with a Retry-After header when the server-wide cap is
reached. Default: 100 sessions.

Per-client session limit: CreateSession enforces a maximum number of
concurrent sessions per auth.Identity.Subject. Anonymous clients are
exempt. The counter is rolled back on all failure paths. Default: 10
sessions per identity.

Idle session timeout: a background reaper goroutine terminates sessions
that have had no CallTool activity for longer than the configured
threshold. The idle clock resets on every tool call and is initialised
when a session is fully established. The reaper is wired into
shutdownFuncs so it stops cleanly on server shutdown. Default: 5 min.

All three limits are configurable via Config fields; zero values select
the defaults. The Limits struct is passed to sessionmanager.New() and
all existing call sites are updated.

Closes: #3874
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Mar 16, 2026
@amirejaz
Copy link
Copy Markdown
Contributor

@yrobla could you take a look at this draft PR and confirm if it’s still relevant?

@yrobla
Copy link
Copy Markdown
Contributor Author

yrobla commented Mar 31, 2026

@yrobla could you take a look at this draft PR and confirm if it’s still relevant?

well, we agreed to implement it at some point, but it was not critical. If you don't mind, i prefer to keep them as draft, to resume the work when needed.

@JAORMX
Copy link
Copy Markdown
Collaborator

JAORMX commented Apr 13, 2026

So, I went through this pretty thoroughly and I have some thoughts. First off, the intent here is good... DoS protection for session management is something we'll want eventually. But there are some issues that I think need addressing before this can move forward.

I agree with Jeremy's two blockers, and I found a few more things on top of those.

The Manager is already doing too much

Jeremy flagged this and he's right. The Manager struct today has 4 fields and ~15 methods covering session lifecycle, caching, SDK adaptation, backend notification, and session decoration. That's already borderline. This PR would add three separate synchronization primitives (atomic.Int64, sync.Mutex, sync.RWMutex) each protecting different state subsets, plus a background goroutine. That's our own anti-pattern #3 in .claude/rules/vmcp-anti-patterns.md staring right at us.

The decomposition path is actually pretty clear:

  • Session admission (global + per-client limits) should be its own type with its own lock and tests
  • Idle reaping... we already have a TODO in RestorableCache (cache.go line 56) for exactly this. The cache already has onEvict callbacks. That's the natural home for it, not a new goroutine on Manager

Counter-drift bug in Generate()

OK this one is subtle but real. When storage fails in Generate(), the activeSessionCount decrement is conditional:

if sm.limits.MaxSessions > 0 {
    sm.activeSessionCount.Add(-1)
}

If you incremented the counter, you need to decrement it on failure. Period. Making the rollback conditional on a config value that could theoretically change between the increment and the error path means the counter can drift permanently. Once it drifts, the global limit rejects new sessions forever until restart. The fix is simple... make the decrement unconditional.

Also, the limited path increments before storage (reservation pattern) while the unlimited path increments after. That asymmetry is confusing and easy to break.

Per-client map entries stick around at zero

decrementPerClientCount decrements the count but never deletes the map entry when it hits 0. So every unique subject that ever connected leaves a permanent zero-valued entry in perClientCounts. Not a huge deal, but it's unbounded growth with no eviction. Easy fix.

The middleware splits admission logic across two layers

The sessionLimitMiddleware does an optimistic (non-atomic) check, then Generate() does the real atomic enforcement. This means:

  • If the middleware rejects, fine, early exit
  • If the middleware passes but Generate() rejects, Generate() returns "" and the SDK gets a cryptic failure

Pick one place for the check. Either the middleware owns it entirely (and returns a proper 503), or Generate() owns it entirely. Splitting it is anti-pattern #4 from our own rules.

Missing test coverage

The tests that are here use good patterns (timestamp back-dating instead of time.Sleep, nice). But there's no:

  • Concurrent admission test (N goroutines racing against MaxSessions=M... that's the whole point of the atomic reservation pattern and it's untested)
  • sessionLimitMiddleware test (HTTP 503, Retry-After header, existing sessions pass through)
  • IdleSessionTimeout > SessionTTL clamping test

Codecov shows 68.7% patch coverage with 46 lines missing.

Small thing: wrong change type

The PR checks "Refactoring (no behavior change)" but adding session limits and idle timeout is a new feature. Not a big deal but worth fixing.

Bottom line

I think we should close this one out and reopen it as smaller, focused PRs once we've done the Manager decomposition. Something like:

  1. PR to extract session admission into its own type
  2. PR to implement the RestorableCache idle sweep (the TODO is already there waiting for us)
  3. PR to wire the limits into the server config

That way each piece is testable and reviewable on its own. Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[vMCP] Implement resource exhaustion and DoS protection for session creation

6 participants